1. Introduction


One  of  the  goals  of  human  system  integrators  (HSI)  is  to  design  a  system  so  the  individuals 
operating  it  can  do  so  with  optimum  effectiveness.  A  contributing  factor  to  optimum 
effectiveness  is  the  amount  of  mental  workload  an  operator  experiences  while  operating  the 
system.  HSI  professionals  consider  an  optimally  designed  system  to  result  in  evenly  distributed 
manageable  workload.  Therefore,  mental  workload  level  is  an  indicator  of  optimum  system 
design  and  a  critical  design  parameter.  Furthermore,  it  is  a  design  parameter  that  should  be 
considered  early  in  the  design  process  when  problems  detected  are  less  expensive  to  correct  and, 
therefore,  more  likely  to  be  implemented.  To  achieve  the  goal  of  detecting  mental  workload 
issues  early  in  the  design  phase,  researchers  at  the  U.S.  Army  Research  Laboratory  (ARL)  have 
designed the Improved Performance Research Integration Tool (IMPRINT)
(http://www.arl.army.mil/IMPRINT).

IMPRINT  is  a  human  performance  modeling  tool  that  provides  HSI  professionals  with  the 
capability of predicting the impacts of mental workload on the performance of the human
operators  of  a  system.  Using  IMPRINT,  HSI  professionals  represent  the  operators  of  a  system 
performing  tasks  with  the  system  equipment  to  accomplish  a  set  of  goals.  They  estimate  the 
mental  demands  these  tasks  impose  upon  the  operators  using  numeric  scales  embedded  within 
the  tool.  The  IMPRINT  software  then  predicts  the  overall  workload  of  the  operators  of  the 
system and identifies any potential high-workload task combinations. When IMPRINT predicts that a
particular  set  of  tasks  contributes  to  mental  overload,  system  designers  can  make  design  changes 
to  reduce  the  workload.  Once  they  make  the  changes,  analysts  can  model  the  system  again  in 
IMPRINT  to  see  if  they  do  indeed  reduce  workload.  Eventually,  however,  the  system  designers 
will  evaluate  the  system  design  within  a  laboratory  or  field  test. 

Because  mental  workload  is  a  critical  design  criterion,  when  the  developmental  testers  write  their 
test  and  evaluation  plan  for  a  new  system  design,  they  should  include  an  evaluation  of  the 
impacts  of  workload  on  performance  within  their  test  plan  as  criteria  for  system  effectiveness. 

To  achieve  this  goal,  they  can  include  in  the  plan  the  task  combinations  IMPRINT  predicted 
would  contribute  to  high  workload  and  then  evaluate  the  workload  level  and  associated 
performance.  To  do  this  effectively,  however,  it  is  important  that  the  evaluators  and  the 
IMPRINT  analysts  develop  a  procedure  that  ensures  the  test  evaluates  mental  workload  by  a 
methodology  comparable  with  the  mental  workload  model  within  the  IMPRINT  tool.  ARL  and 
Aberdeen  Test  Center  (ATC)  researchers  achieved  this  goal  within  the  Automated 
Communications  Analysis  of  Situation  Awareness  (ACASA)/IMPRINT/Joint  Warfighter  Test 
and  Training  Capability  (JWTTC)  experiment. 


2. Objectives


The  ACASA/IMPRINT/JWTTC  test  had  several  goals.  The  first  and  main  objective  was  to 
collect voice communications data within a scenario that reflects the U.S. Army Future Combat
System  (FCS)  operational  concept.  SA  Technologies  researchers  would  then  use  this  data  to 
develop  the  ACASA  tool.  A  second  objective  was  to  test  the  current  iteration  of  JWTTC 
instrumentation.  The  last  objective  was  to  collect  workload  data  compatible  with  IMPRINT 
workload  predictions  in  order  to  verify  IMPRINT  analytical  predictions  and  to  enhance  existing 
IMPRINT  models  of  FCS  crews.  This  report  documents  the  final  objective. 


3.  Methodology 


3.1  Mental  Workload 

To compare the workload ratings from the ACASA/IMPRINT/JWTTC test with IMPRINT
workload  predictions,  the  researchers  needed  to  collect  workload  ratings  when  the  test 
participants  were  performing  task  combinations  that  would  match  task  combinations  that  had 
predictions  in  already  developed  FCS  IMPRINT  models.  This  workload  collection  technique 
was  required  in  order  for  the  experiment  to  match  the  theory  and  technique  IMPRINT  uses  to 
predict  workload. 

IMPRINT  predicts  workload  based  on  Wickens’  (1991)  Multiple  Resource  Theory  (MRT). 
According  to  MRT,  human  mental  resources  for  handling  tasks  are  limited.  When  an  individual 
is  required  to  perform  multiple  tasks  at  the  same  time,  he  or  she  is  utilizing  the  same  limited 
resources  for  the  concurrent  tasks.  This  combination  of  limited  cognitive  resources  and  multiple 
task  demands  may  result  in  high  workload  that,  in  turn,  may  lead  to  a  greater  number  of  errors, 
increased  task  time,  or  both. 

To build an MRT workload analysis in IMPRINT, analysts begin by building a task-network
model  to  represent  the  functions  and  tasks  individuals  perform  as  they  interact  with  the  system  to 
accomplish  a  set  of  goals  called  a  mission.  The  model  also  includes  the  equipment  or  interfaces 
each  individual  uses  for  each  task  within  the  mission.  Each  interface,  in  turn,  requires  the 
individual using it to draw on one or more of four mental resources (visual, auditory, cognitive, or
psychomotor) when completing a task. To quantify the amount of each mental resource each
individual uses to complete each task with each interface, the analysts use four behaviorally
anchored rating scales embedded into the IMPRINT tool (McCracken and Aldrich, 1984). Each
of  these  scales  represents  one  of  the  four  different  mental  resources  and  provides  the  IMPRINT 
user  with  a  consistent  method  for  entering  how  much  of  each  resource  the  human  uses  for  each 
task  he  or  she  performs  with  each  interface. 

Throughout  the  mission,  the  individuals  operating  the  system  will  use  multiple  interfaces  and 
equipment  to  perform  some  tasks  concurrently.  To  predict  the  workload  across  these  multiple 
tasks,  the  IMPRINT  software  has  an  algorithm  that  aggregates  the  workload  estimates  the 
analysts  selected  for  each  task.  The  IMPRINT  reports  display  this  overall  workload  number  for 
each  individual  each  time  a  new  task  begins  or  ends  in  the  mission.  Mitchell  (2000)  provides  a 
complete  description  of  the  workload  approach  in  IMPRINT. 
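
The aggregation idea described above can be illustrated with a minimal sketch. It is not the IMPRINT algorithm (Mitchell, 2000, documents that); it assumes a purely additive rule in which the demands of all concurrent tasks are summed, and it recomputes the total each time a task begins or ends. The task names, times, and demand values are hypothetical.

```python
from dataclasses import dataclass

RESOURCES = ("visual", "auditory", "cognitive", "psychomotor")


@dataclass
class Task:
    name: str
    start: float   # seconds into the mission
    end: float     # seconds into the mission
    demand: dict   # resource name -> analyst-selected scale value


def workload_timeline(tasks):
    """Return [(time, total_demand)] at every task start or end.

    Assumption: overall workload is the plain sum of the demands of all
    tasks active at that time. IMPRINT's own aggregation may differ.
    """
    event_times = sorted({t.start for t in tasks} | {t.end for t in tasks})
    timeline = []
    for now in event_times:
        # A task counts as active from its start up to, not including, its end.
        active = [t for t in tasks if t.start <= now < t.end]
        total = sum(t.demand.get(r, 0.0) for t in active for r in RESOURCES)
        timeline.append((now, total))
    return timeline


# Hypothetical tasks and demand values, for illustration only.
tasks = [
    Task("monitor display", 0, 60, {"visual": 5.0, "cognitive": 3.7}),
    Task("radio report", 20, 40, {"auditory": 4.3, "cognitive": 5.3}),
]
timeline = workload_timeline(tasks)
for time, total in timeline:
    print(f"t={time:>4}s  workload={total:.1f}")
```

In this sketch the total rises while the two hypothetical tasks overlap and falls back when the shorter task ends, which is the kind of change an analyst would look for when scanning for high-workload task combinations.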

The  ARL  analysts  used  the  IMPRINT  workload  approach  to  build  a  model  to  represent  a  basic 
set  of  functions  and  their  associated  tasks  that  the  three  Soldiers  in  the  FCS  mounted  combat 
system (MCS) would perform. The functions included within this model were battle tracking,
local  security,  communications  (crew  and  higher  headquarters),  driving,  target  engagement  and 
utilizing  unmanned  assets.  Because  the  IMPRINT  software  calculates  the  workload  for  specific 
combinations  of  functions  and  tasks,  the  researchers  conducting  the  ACASA/IMPRINT/JWTTC 
test  needed  to  collect  the  workload  data  for  a  set  of  tasks  similar  to  those  in  the  functions  of  the 
IMPRINT  MCS  model.  Collecting  data  from  the  similar  sets  of  tasks  would  permit  the  ARL 
analysts  to  compare  the  test  workload  data  to  the  IMPRINT  predicted  workload  data  for  the  same 
tasks.  However,  in  addition  to  similar  tasks,  the  workload  measures  collected  during  the  test 
must  indicate  the  level  of  workload  the  test  participant  was  experiencing  while  performing  the 
tasks.  The  researchers  could  then  compare  the  collected  workload  ratings  to  the  IMPRINT 
predictions.  To  meet  this  goal,  the  researchers  selected  multiple  workload  measures  to  collect 
workload  levels  during  the  experiment. 

3.1.1  Workload  Measures 

Researchers  typically  use  one  or  more  of  three  types  of  measures  to  collect  workload:  subjective 
measures, physiological measures, and performance-based measures. All of these were collected
during  the  ACASA/IMPRINT/JWTTC  experiment. 

Subjective  workload  measures  are  “self-report”  workload  measures  because  with  this  technique 
individuals  rate  their  own  workload.  The  technique  assumes  that  an  individual  can  perceive  the 
amount  of  effort  they  are  using  to  complete  one  or  more  tasks.  The  researchers  used  a  modified 
version of the Instantaneous Self Assessment (ISA) workload ratings (Kirwan et al., 1997) to
collect  workload  during  the  experiment.  The  modified  ISA  scale  has  behaviorally  anchored 
descriptions  of  varying  levels  of  workload  on  a  simple  1  to  5  scale.  The  experiment  was  paused 
several  times  in  order  to  collect  data  for  another  tool,  ACASA.  During  these  pauses,  the  test 
participants  gave  ISA  ratings.  Table  1  shows  the  ISA  scale  used  during  the  experiment. 

Table 1. Instantaneous self-assessment of workload.

Circle how much workload you were feeling in your mind at the time the simulation was paused.

1  2  3  4  5

1 = Nothing to do. Rather boring.
2 = More than enough time for all tasks.
3 = All tasks going well. Busy but exciting speed of tasks. Could keep going always at this pace.
4 = Less important tasks suffering. Could not work at this level very long.
5 = Behind on tasks. Losing track of the full picture.
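
For analysis, the scale in table 1 can be encoded as a lookup table. The sketch below is illustrative only: the function name and the sample ratings are hypothetical, and summarizing the ratings from several pauses by their arithmetic mean is an assumption, not a documented step of the ISA technique.

```python
from statistics import mean

# Anchor descriptions copied from table 1.
ISA_SCALE = {
    1: "Nothing to do. Rather boring.",
    2: "More than enough time for all tasks.",
    3: "All tasks going well. Busy but exciting speed of tasks. "
       "Could keep going always at this pace.",
    4: "Less important tasks suffering. Could not work at this level very long.",
    5: "Behind on tasks. Losing track of the full picture.",
}


def summarize_isa(ratings):
    """Validate the ISA ratings from one mission and return their mean."""
    for rating in ratings:
        if rating not in ISA_SCALE:
            raise ValueError(f"ISA rating must be 1-5, got {rating!r}")
    return mean(ratings)


# Hypothetical ratings given at three pauses in one mission.
mission_ratings = [3, 3, 2]
average = summarize_isa(mission_ratings)
nearest = ISA_SCALE[round(average)]
print(f"mean ISA = {average:.2f}; nearest anchor: {nearest}")
```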


3.1.1.2 Physiological Workload Measures. One of the ATC test objectives was to develop
physiological workload measurement techniques. Physiological measures assume that evaluators
can assess mental workload from an individual’s physiological signals, such as
eye tracking, electrocardiogram (EKG), electroencephalogram (EEG), and galvanic skin
response (GSR) measures. During the test, ATC evaluators instrumented two test participants
and collected the EEG, EKG, and GSR data. They would use this data to identify times when
the test participants’ physiological data changed in response to events in the experiment. Once
they knew what events triggered physiological changes, they could identify when these
physiological changes indicated the participants were experiencing high workload to see if the
high workload events matched the tasks in the IMPRINT workload predictions.

3.1.1.3 Performance-Based Workload Measures. The underlying assumption for
performance-based measures of workload is that workload increases as a task becomes more
difficult for a person to perform or as the person performs more simultaneous tasks. The higher
an individual’s workload, the more likely he or she is to experience performance problems due to
workload. SA Technology researchers designed the performance-based workload measures
collected during the test to assess the participants’ awareness of the platoon situation at a specific
time in the test. Specifically, at preplanned pauses in the simulation, the test participants
recorded a number of key mission data points. For example, they identified the number of
buildings the platoon’s unmanned ground vehicle (UGV) entered, the number of IEDs the
platoon encountered up to the current pause in the mission, and the number of insurgent attacks
the platoon encountered since the last pause. The SA Technology researchers compared the
participants’ written answers to the actual data as indicators of performance. The IMPRINT
analysts used these measures to indicate performance accuracy and compared them to the
IMPRINT high workload predictions to see if performance declined during the task combinations
for which IMPRINT predicted it would decline.
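
The comparison of written answers with the actual data can be sketched as a simple scoring step. This is a hypothetical illustration, not the SA Technology scoring procedure: the query names, the exact-match rule, and the sample values are all assumptions.

```python
def score_pause(answers, ground_truth):
    """Return the fraction of queries answered exactly right at one pause.

    A query missing from `answers` counts as incorrect. Exact matching is
    an assumption; a real scoring rule might allow a tolerance.
    """
    if not ground_truth:
        raise ValueError("ground_truth must contain at least one query")
    correct = sum(
        1 for query, truth in ground_truth.items()
        if answers.get(query) == truth
    )
    return correct / len(ground_truth)


# Hypothetical values for one pause.
ground_truth = {
    "buildings_entered_by_ugv": 3,
    "ieds_encountered": 2,
    "insurgent_attacks_since_last_pause": 1,
}
answers = {"buildings_entered_by_ugv": 3, "ieds_encountered": 1}

accuracy = score_pause(answers, ground_truth)
print(f"accuracy = {accuracy:.2f}")
```

A per-pause accuracy of this kind could then be set beside the workload predicted for the task combination in progress at that pause.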

3.2  Situation  Awareness  Measures 

During the ACASA/IMPRINT/JWTTC test, the SA Technology researchers collected
performance  on  secondary  tasks,  such  as  dismount  threats  and  SA  questions.  The  test 
participants  recorded  this  data  during  the  pauses.  It  provided  the  SA  Technology  researchers 
with  measures  of  situation  awareness,  mental  workload,  and  performance.  Effective  response  to 
a  dismount  threat  was  indicative  of  good  situation  awareness  and  reasonable  workload,  and  was 
evidence of good performance (Abounader et al., 2008). Slow or ineffective response to a
dismount  threat  was  indicative  of  poor  situation  awareness  or  cognitive  overload,  and  was 
evidence  of  poor  performance  (Abounader  et  al.,  2008).  A  quick,  correct  response  to  an  SA 
probe  was  indicative  of  good  situation  awareness  and  reasonable  workload,  while  an  incorrect  or 
non-existent  response  to  an  SA  probe  was  indicative  of  poor  situation  awareness  or  cognitive 
overload  (Abounader  et  al.,  2008).  A  correct  response  was  a  predictor  of  good  performance,  while 
an  incorrect  response  was  a  predictor  of  poor  performance.  ATC  personnel  integrated  these 
measures into the scenarios, where they provided situation awareness, workload, and performance
data for the SA Technology researchers, as well as the ATC and IMPRINT researchers.

In  addition  to  the  performance-based  SA  measures,  the  SA  Technology  researchers  collected 
subjective SA assessments using a post-trial participant subjective situation awareness
questionnaire  (PSAQ)  (Abounader  et  al.,  2008). 

3.3  Test  Participants 

Participants  were  six  soldiers  assigned  to  the  test  and  evaluation  group  at  ATC.  The  ATC 
researcher  assigned  each  of  the  participants  to  one  of  two  three-member  teams.  Each  team 
represented  the  crew  of  an  FCS  MCS  platform.  One  team  represented  the  Platoon  Leader’s  (PL) 
MCS  and  the  other  the  Platoon  Sergeant’s  (PSG)  MCS.  Table  2  displays  the  FCS  MCS  vehicles 
and player positions and roles. A role determines the information displayed to Soldiers so they
can achieve the goals of a particular military position or job. Some positions consist of multiple
roles because the Soldiers in those positions need information from several roles.


Table 2. FCS MCS positions and roles represented in simulation.

MCS Vehicle 1

| Position | Rank | FCS Roles |
|---|---|---|
| Platoon Leader/Vehicle Commander | O1 | Crewmember; MCS Company Platoon Leadership; Robotics Technician; Vehicle Commander |
| Crew Chief | E5 | Crewmember; Robotics Technician |
| Driver | E4 | Crewmember; Driver |

MCS Vehicle 2

| Position | Rank | FCS Roles |
|---|---|---|
| Platoon Sergeant/Vehicle Commander | E7 | Crewmember; MCS Company Platoon Leadership; Robotics Technician; Vehicle Commander |
| Crew Chief | E5 | Crewmember; Robotics Technician |
| Driver | E4 | Crewmember; Driver |

The test participants represented MCS crews because ARL analysts had modeled the Platoon
Leader’s  MCS  vehicle  in  IMPRINT  and  made  predictions  on  high  workload  task  combinations. 
Thus,  the  analysts  could  compare  the  workload  predictions  for  the  MCS  crews  in  IMPRINT  with 
the  MCS  crews  in  the  experiment.  In  order  to  compare  the  workload  predictions  from  the  model 
to  those  in  the  experiment,  the  participants  had  to  perform  experimental  tasks  similar  to  the 
IMPRINT  model  tasks. 

3.4  Test  Tasks 

For  the  FCS  MCS  crew  analyses,  the  IMPRINT  analyses  predicted  battle  tracking,  local  security 
tasks,  utilizing  unmanned  assets,  driving  and  communications  would  result  in  high  workload 
when  combined  with  standard  mission  execution  tasks  (Mitchell,  2005).  Therefore,  the  test 
participants  performed  these  tasks  during  the  experiment.  For  the  experiment,  the  researchers 
defined  each  of  these  tasks  by  specific  observable  behaviors  and  performance  measures. 

1. Battle Tracking.

a.  Building  orders  and  graphics  on  display 

b.  Watching  current  operations  on  display 

c.  Looking  at  display  for  possible  threats 

d.  Marking  target  identification  on  display 

e.  Looking  at  unmanned  aerial  vehicle  location  on  display 

f.  Looking  at  unmanned  ground  vehicle  location  on  display 

At  specified  times  the  test  participant  was  required  to  identify  and  report  certain  pieces  of 
information.  Whether  or  not  the  test  participant  identified  and  reported  the  data  correctly  was  the 
performance  accuracy  measure. 

2.  Fire  Missions. 

a.  Looking  at  display  for  possible  threats 

b.  Firing  at  line-of-sight  target 

c.  Firing  beyond-line-of-sight  mission 

d.  Checking  ammunition  status 

e.  Checking  damage  to  target 

ATC  contract  personnel  recorded  on  video  and  ARL  personnel  recorded  observational  data  on 
the  targets  identified,  targets  missed,  false  reports,  number  of  line-of-sight  missions,  battle 
damage  assessment  reports,  spot  reports,  and  ammunition  reports.  Comparison  of  information  in 
reports  to  correct  information  was  the  performance  accuracy  measure. 

3. Monitoring communications within the MCS, between MCSs, and from headquarters (voice
communications).

a.  Listening  for  messages  from  company 

b.  Hearing  a  voice  message  from  company 

c.  Sending  a  voice  message  to  company 

d.  Listening  for  voice  messages  from  your  platoon 

e.  Hearing  a  voice  message  from  your  platoon 

f.  Saying  a  voice  message  to  your  platoon 

g.  Saying  something  to  someone  in  your  vehicle 

h.  Listening  to  someone  in  your  vehicle 

Test participants were required to respond to voice messages addressed to their call sign. ATC
contract personnel recorded on video the number of times messages were sent or spoken before
responses occurred. Accuracy of response was observed and recorded (the accuracy measure
depended on the type of message).

4.  Monitor  unmanned  assets. 

a.  Reporting  location  of  unmanned  asset 

b.  Looking  at  information  from  unmanned  ground  vehicle  sensor 

c.  Reporting  location  of  unmanned  aerial  vehicle 

d.  Looking  at  information  from  unmanned  aerial  vehicle 

e.  Tele-operating  unmanned  ground  vehicle 

f.  Tele-operating  unmanned  aerial  vehicle 

The test participants were required to monitor the status of unmanned assets attached to the unit.
An unmanned asset could be moving. Researchers asked the test participants the status of their
unmanned asset.

5.  Navigating. 

a.  Maintaining  route 

b.  Driving 

c.  Watching  the  driver’s  driving 

ATC  contract  personnel  recorded  the  number  of  crashes  and  driver  behavior  on  video  and  ARL 
personnel  recorded  observational  data. 


3.5 Test Scenario


ATC  and  ARL  personnel  conducted  the  test  in  ATC  building  400B.  When  the  test  participants 
arrived,  the  researchers  gave  an  overview  of  the  test  and  asked  them  to  fill  out  informed  consent 
forms  and  a  demographic  questionnaire.  Next,  the  researchers  assigned  the  participants  to  one  of 
three  player  roles  (table  2)  and  the  participants  practiced  using  the  simulation  in  their  assigned 
roles. On days 2 and 3 of the test, the participants performed eight trials (scenario runs) across
two full 8-h days of data collection (refer to table 3). Data collection was continuous throughout
the scenario runs. Day 4 was reserved for repeating any scenarios, if needed, and conducting an
After-Action Review (AAR), which would include debriefing participants and answering any
questions they had about the experiment.


Table 3. Daily test schedule.

| Day | Period | Activity |
|---|---|---|
| Monday | Morning/Afternoon | Train participants on the simulation and their FCS roles |
| Tuesday | Morning | First and second mission threads |
| Tuesday | Afternoon | Third and fourth mission threads |
| Wednesday | Morning | Fifth and sixth mission threads (VIP day) |
| Wednesday | Afternoon | Seventh and eighth mission threads (VIP day) |
| Thursday | Morning | Repeat aborted or omitted mission thread if necessary, otherwise AAR |
| Thursday | Afternoon | AAR if not completed in the morning, otherwise nothing scheduled |

Across  the  week  of  testing,  the  participants  played  the  roles  of  MCS  crews  in  four  desert 
scenarios  and  four  urban  scenarios.  A  computer  simulation  represented  each  scenario  and 
included  a  different  type  of  mission  as  follows: 

•  Desert  no.  1  -  Detect  IEDs  on  main  supply  route  and  secondary  routes. 

•  Desert  no.  2  -  Find  insurgents  in  specific  villages;  use  UGV  to  search  buildings  as 
required 

•  Desert  no.  3  -  Surveillance  of  outdoor  market;  respond  to  suspicious  activity 

•  Desert  no.  4  -  Locate  weapons  cache  at  night 

•  Urban  no.  1  -  Detect  IEDs  on  urban  roads 

•  Urban  no.  2  -  Clear  buildings  in  specified  area 

•  Urban  no.  3  -  Locate  and  follow  a  dismounted  insurgent 

•  Urban  no.  4  -  Locate  weapons  cache  at  night 

During  the  missions,  the  test  participants  collaborated  using  voice  communication  as  well  as 
interacting  with  events  in  the  simulation.  ATC  personnel  paused  the  simulation  at  specific  times 
to  allow  testers  to  gather  data. 

3.6 Simulation

The  simulation  engine  driving  the  test  scenarios  was  based  on  UNREAL,  a  COTS  first-person 
shooter  computer  game  developed  by  Epic  Games  and  Digital  Extremes  and  published  by  GT 
Interactive  (now  owned  by  Atari)  in  1998.  ATC  personnel  modified  the  UNREAL  simulation  to 
meet the data collection requirements of this experiment. Specifically, they modified the
simulation to log or provide the following:

1. Scenario events (e.g., injects, ‘actions’ of friendly and enemy units);

2. Participant inputs (e.g., button presses);

3. Log output of vehicle and inject outcomes every 30 s;

4. Ammunition use and levels during the scenario;

5. Time stamps for events and participant inputs from the same central reference clock;

6. Experimenter access to a bird’s-eye view of the map during the scenario;

7. An out-the-window view for the driver.

3.7  Data  Collection 

The  IMPRINT  analysts  collected  observational  data  of  the  Platoon  Leader  and  Platoon  Sergeant 
throughout  the  experiment.  They  documented  in  writing  the  major  activities,  communications, 
and  times  for  these  two  positions.  ATC  personnel  recorded  on  video  the  activities  of  the  Platoon 
Leader  and  Platoon  Sergeant.  In  addition,  they  recorded  all  voice  communications  and  the 
physiological EEG, EKG, and GSR measures for the Platoon Leader and Platoon Sergeant. During
simulation  pauses,  the  test  participants  completed  performance  measure  queries  related  to 
situation  awareness  and  gave  ISA  workload  ratings.  At  the  end  of  the  experiment,  they 
completed  McCracken  and  Aldrich  (1984)  workload  scales  because  these  scales  are  the  basis  of 
the  workload  measures  in  IMPRINT.  The  simulation  software  logged  the  relevant  scenario 
events,  such  as  time  of  pauses,  and  participants’  actions,  including  use  of  controls  and  weapons. 
The  simulation  logged  the  location  of  the  participants,  enemies,  unmanned  vehicles,  and 
weapons  every  10  s  to  establish  ground-truth.  Additionally,  all  events,  including  the  appearance 
of  enemy  threats  and  the  firing  of  weapons,  were  logged  and  time-stamped.  At  the  end  of  each 
scenario,  the  simulation  generated  summary  data  including  total  kills  and  shots  fired. 

Additionally,  participants  were  video  recorded.  The  primary  purpose  of  video  recording  was  to 
aid  in  debriefing  or  after  action  interviews.  Additionally,  the  ATC  researchers  used  the  video 
recordings  to  help  synchronize  communication,  physiological,  and  simulation  data  in  the  event 
that  clock  times  and  events  did  not  line  up  properly. 
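
Because logged events and participant inputs were stamped from one reference clock, a rating given at a pause can be related to the events logged shortly before it. The sketch below shows one way to do that lookup. It is an assumption-laden illustration: the log format, the event names, and the 60-s window are hypothetical and do not describe the actual ATC log files.

```python
from bisect import bisect_left, bisect_right


def events_before_pause(events, pause_time, window):
    """Return names of events logged in [pause_time - window, pause_time].

    `events` is a list of (timestamp_seconds, name) sorted by timestamp.
    """
    times = [t for t, _ in events]
    lo = bisect_left(times, pause_time - window)
    hi = bisect_right(times, pause_time)
    return [name for _, name in events[lo:hi]]


# Hypothetical log entries and pause time, in seconds on the shared clock.
log = [
    (95.0, "enemy threat appears"),
    (130.0, "weapon fired"),
    (171.5, "voice message sent"),
    (240.0, "weapon fired"),
]
recent = events_before_pause(log, pause_time=180.0, window=60.0)
print(recent)
```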


4. Data Analysis


Tables  4-9  display  the  workload  ratings  from  the  ISA  scale,  the  PSAQ  ratings,  and  the  functions 
performed  by  each  test  participant  averaged  across  the  pauses  within  each  mission  across  the 
eight  missions  completed  during  the  test. 


Table 4. Platoon leader’s self-report ratings and frequent functions performed across eight missions.

| Mission | ISA | PSAQ: How Hard Working | PSAQ: How Well Performing | PSAQ: How Aware | Most Frequent Function |
|---|---|---|---|---|---|
| 1 | All tasks going well | Somewhat hard | Between average and very well | Between somewhat aware and very aware | Voice Communications; Battle Tracking; Fire Mission; Unmanned Operations |
| 2 | All tasks going well then more than enough time for all tasks | Between not at all hard to somewhat hard | Between average and very well | Between somewhat aware and very aware | Voice Communications; Fire Mission |
| 3 | More than enough time for all tasks | Somewhat hard | Between average and very well | Between somewhat aware and very aware | Voice Communications; Battle Tracking; Unmanned Asset Operations |
| 4 | All tasks going well | Between somewhat hard and very hard | Between average and very well | Between somewhat aware and very aware | Voice Communications; Battle Tracking; Fire Mission; Unmanned Asset Operations |
| 5 | All tasks going well | Between somewhat hard and very hard | Average | Between somewhat aware and very aware | Not recorded |
| 6 | All tasks going well | Between somewhat hard and very hard | Average | Somewhat aware | Voice Communications; Fire Mission |
| 7 | More than enough time for all tasks | Somewhat hard | Between average and very well | Between somewhat aware and very aware | Voice Communications; Battle Tracking; Fire Mission; Unmanned Asset Operations |
| 8 | More than enough time for all tasks then all tasks going well | Between somewhat hard and very hard | Average | Somewhat aware | Voice Communications; Fire Mission; Unmanned Asset Operations |

Table 5. Crew chief of the platoon leader’s vehicle self-report ratings and frequent functions performed across eight missions.

| Mission | ISA | PSAQ: How Hard Working | PSAQ: How Well Performing | PSAQ: How Aware | Most Frequent Function |
|---|---|---|---|---|---|
| 1 | More than enough time for all tasks then nothing to do | Somewhat hard | Between average to very well | Between somewhat aware and very aware | Voice Communications; Fire Mission |
| 2 | Nothing to do | Not at all hard | Very well | Very aware | Nothing |
| 3 | No data | No data | No data | No data | No data |
| 4 | No data | No data | No data | No data | Fire Mission |
| 5 | All tasks going well | Somewhat hard | Very well | Very aware | Nothing |
| 6 | All tasks going well | Somewhat hard | Between average to very well | Between somewhat aware and very aware | Voice Communications; Fire Mission; Unmanned Asset Operations |
| 7 | Nothing to do | Not at all hard | Between average to very well | Between somewhat aware and very aware | Voice Communications; Battle Tracking; Fire Mission |
| 8 | Nothing to do then all tasks going well | Somewhat hard | Very well | Very aware | Voice Communications; Fire Mission |

Table 6. Driver of the platoon leader’s vehicle self-report ratings and frequent functions performed across eight missions.

| Mission | ISA | PSAQ: How Hard Working | PSAQ: How Well Performing | PSAQ: How Aware | Most Frequent Function |
|---|---|---|---|---|---|
| 1 | All tasks going well | Somewhat hard | Between average and very well | Very aware | Not collected |
| 2 | All tasks going well | Somewhat hard | Very well | Very aware | Not collected |
| 3 | Less important tasks suffering | Not at all hard | Average | Very aware | Voice Communications; Battle Tracking; Fire Mission |
| 4 | Less important tasks suffering then all tasks going well | Somewhat hard | Average | Very aware | Voice Communications |
| 5 | All tasks going well | Very hard | Between average and very well | Very aware | Nothing |
| 6 | All tasks going well then less important tasks suffering then all tasks going well | Somewhat hard | Average | Very aware | Voice Communications; Battle Tracking; Fire Mission; Driving |
| 7 | Nothing to do then less important tasks suffering | Not at all hard | Average | Very aware | Voice Communications; Battle Tracking; Fire Mission; Driving |
| 8 | All tasks going well | Very hard | Between average and very well | Very aware | Voice Communications; Battle Tracking; Fire Mission; Driving |

Table 7. Platoon sergeant’s self-report ratings and frequent functions performed across eight missions.

| Mission | ISA | PSAQ: How Hard Working | PSAQ: How Well Performing | PSAQ: How Aware | Most Frequent Function |
|---|---|---|---|---|---|
| 1 | All tasks going well | Somewhat hard | Between average and very well | Between somewhat aware and very aware | Voice Communications; Battle Tracking; Fire Mission; Unmanned Asset Operations |
| 2 | All tasks going well | Somewhat hard | Between average and very well | Between somewhat aware and very aware | Voice Communications; Battle Tracking; Fire Mission; Unmanned Asset Operations |
| 3 | All tasks going well | Between somewhat hard and very hard | Average | Between somewhat aware and very aware | Voice Communications; Battle Tracking; Unmanned Asset Operations |
| 4 | All tasks going well | Somewhat hard | Between average and very well | Between somewhat aware and very aware | Voice Communications; Battle Tracking; Unmanned Asset Operations |
| 5 | All tasks going well | Between somewhat hard and very hard | Average | Between somewhat aware and very aware | Voice Communications; Battle Tracking; Fire Mission; Unmanned Asset Operations |
| 6 | All tasks going well | Somewhat hard | Between average and very well | Between somewhat aware and very aware | Voice Communications; Battle Tracking; Fire Mission; Unmanned Asset Operations |
| 7 | All tasks going well | Between not at all hard to somewhat hard | Average | Between not at all aware and somewhat aware | Voice Communications; Battle Tracking; Fire Mission; Unmanned Asset Operations |
| 8 | All tasks going well | Between somewhat hard and very hard | Average | Between somewhat aware and very aware | Voice Communications; Battle Tracking; Fire Mission; Unmanned Asset Operations |

Table 8. Crew chief of platoon sergeant’s vehicle self-report ratings and frequent functions performed across eight missions.

| Mission | ISA | PSAQ: How Hard Working | PSAQ: How Well Performing | PSAQ: How Aware | Most Frequent Function |
|---|---|---|---|---|---|
| 1 | More than enough time for all tasks then nothing to do | Not at all hard | Average | Between somewhat aware and very aware | Voice Communications; Fire Mission |
| 2 | Nothing to do | Not at all hard | Very well | Between somewhat aware and very aware | Fire Mission |
| 3 | More than enough time for all tasks | Between not at all hard to somewhat hard | Very well | Between somewhat aware and very aware | Not recorded |
| 4 | All tasks going well | Between not at all hard to somewhat hard | Very well | Between somewhat aware and very aware | Not recorded |
| 5 | More than enough time for all tasks to all tasks going well | Somewhat hard | Average | Between somewhat aware and very aware | Not recorded |
| 6 | More than enough time for all tasks to all tasks going well | Somewhat hard | Average | Between somewhat aware and very aware | Not recorded |
| 7 | Nothing to do then all tasks going well | Between not at all hard to somewhat hard | Between average and very well | Between somewhat aware and very aware | Not recorded |
| 8 | All tasks going well | Between somewhat hard and very hard | Between average and very well | Between somewhat aware and very aware | Voice Communications; Fire Mission |

Table 9. Driver of platoon sergeant’s vehicle self-report ratings and frequent functions performed across eight missions.

| Mission | ISA | PSAQ: How Hard Working | PSAQ: How Well Performing | PSAQ: How Aware | Most Frequent Function |
|---|---|---|---|---|---|
| 1 | Nothing to do | Not at all hard | Very well | Very aware | Driving |
| 2 | Nothing to do | Not at all hard | Very well | Very aware | Driving; Fire Mission |
| 3 | More than enough time for all tasks then nothing to do | Not at all hard | Very well | Very aware | Voice Communications |
| 4 | More than enough time for all tasks | Not at all hard | Very well | Very aware | Nothing |
| 5 | More than enough time for all tasks then all tasks going well | Not at all hard | Average | Very aware | Nothing |
| 6 | More than enough time for all tasks then nothing to do | Not at all hard | Average | Very aware | Voice Communications |
| 7 | Nothing to do | Not at all hard | Average | Very aware | Nothing |
| 8 | More than enough time for all tasks | Not at all hard | Average | Between somewhat aware and very aware | Nothing |

As  tables  4-9  show,  the  primary  function  performed  by  all  players  throughout  the  test  was  voice 
communications.  Therefore,  table  10  shows  the  communications  frequency  throughout  the  test 
by  player. 

In addition to completing the communications data analysis shown in table 10, the
researchers compared the voice communication times collected during the experiment with the
speech micro-model predictions in IMPRINT. To perform this comparison, they counted each word
from the voice messages in all eight missions. They then used the word counts to calculate the
predicted verbal communication times in IMPRINT. Finally, they compared the times from the test
data to the IMPRINT predictions to determine the relationship, if any, that existed between the
two data sets. They investigated the relationship between actual voice communication time
(MissionTime) and predicted voice communication time (IMPRINTTime) using the Pearson
product-moment correlation coefficient, calculated with SPSS v.15. They found a strong,
positive correlation between the two variables [r = 0.649, n = 3874, p < 0.0005]. Table 11
shows the results of the calculation.
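
The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the SPSS procedure the researchers ran: the linear per-word timing rule, the constant `ASSUMED_SECONDS_PER_WORD`, and the sample data are all assumptions made for the example, since the report does not give the parameters of the IMPRINT speech micro-model.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical rate: the real micro-model parameters are not stated in the report.
ASSUMED_SECONDS_PER_WORD = 0.4


def predicted_times(word_counts, seconds_per_word=ASSUMED_SECONDS_PER_WORD):
    """Predict speaking time from word count with an assumed linear rule."""
    return [count * seconds_per_word for count in word_counts]


# Made-up example data, one entry per voice message.
word_counts = [2, 5, 7, 3, 10]
measured_seconds = [1.1, 2.4, 2.9, 1.0, 4.6]

r = pearson_r(measured_seconds, predicted_times(word_counts))
print(f"r = {r:.3f}")
```

Under a purely linear rule like this one, the choice of rate does not change r, because the Pearson coefficient is unaffected by positive rescaling of either variable.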

Table 10. Communication duration and frequency by position.


POSITION 

CO 

PL 

PLG 

PLD 

PSGT 

PSGG 

PSD 

All 

CO 

Mean  (milliseconds) 

5972 

1125 

955 

7170 

689 

4031 

7443 

No.  of  Msgs. 

41 

<1 

<1 

28 

5 

5 

22 

PL 

Mean  (milliseconds) 

8808 

4655 

9780 

9122 

661 

8526 

7714 

No.  of  Msgs. 

50 

15 

22 

16 

1 

8 

19 

PLG 

Mean  (milliseconds) 

542 

4043 

2797 

0 

1219 

1382 

2991 

No.  of  Msgs. 

<1 

20 

23 

0 

<1 

1 

6 

PLD 

Mean  (milliseconds) 

510 

6574 

4884 

0 

622 

695 

6657 

No.  of  Msgs. 

1 

29 

16 

0 

<1 

1 

6 

PSGT 

Mean  (milliseconds) 

3052 

4190 

0 

0 

2172 

3359 

5071 

No.  of  Msgs. 

34 

17 

0 

0 

4 

18 

8 

PSGG 

Mean  (milliseconds) 

1664 

2720 

0 

452 

3893 

3288 

3143 

No.  of  Msgs. 

2 

1 

0 

<1 

7 

9 

2 

PSGD 

Mean  (milliseconds) 

2919 

2787 

1734 

1659 

4392 

3519 

2181 

No.  of  Msgs. 

1 

8 

1 

24 

17 

0 

5 

16 


Table 11. Correlation between IMPRINT speech model times and test data.

| Variable | Statistic | MissionTime | IMPRINTTime |
|---|---|---|---|
| MissionTime | Pearson Correlation | 1 | 0.649(a) |
| | Sig. (2-tailed) | - | 0.000 |
| | N | 3875 | 3874 |
| IMPRINTTime | Pearson Correlation | 0.649(a) | 1 |
| | Sig. (2-tailed) | 0.000 | - |
| | N | 3874 | 3874 |

(a) Correlation is significant at the 0.01 level (2-tailed).


4.1  Discussion  of  Results 

4.1.1  Overview 

The data analysis in this report focused on the ARL researchers’ observational data, the ISA
ratings, and the SA ratings. The SA Technology researchers and the ATC physiological
researchers are analyzing their own respective data sets.

The collection of voice communications for the ACASA tool was the primary test objective. To
meet this objective, the test designers intentionally developed a scenario that would generate
frequent voice message traffic. For example, although FCS vehicles and dismounts have digital
communications capability, the test participants communicated by voice. Therefore, as tables
4-9 show, sending and responding to voice message traffic, either alone or in combination with
other tasks, became the most frequent mission task performed by most of the test participants.
Although the primary test objective may have influenced the rate of voice communications,
communications monitoring and responding are tasks performed frequently by FCS vehicle
crews and dismounted Soldiers. For this reason, these tasks are included in every IMPRINT
model the ARL researchers have built to represent FCS platform crews and dismounted Soldiers.
Therefore, the ARL researchers were able to compare some of their IMPRINT predictions
related to communications tasks with the test data.

In the analysis of the results from their FCS models, the ARL researchers had predicted that
combining communications tasks with other tasks would increase workload to a level that would
contribute to decrements in the FCS Soldiers’ performance (Mitchell and Brennan, in review;
Mitchell, 2005). Observational data from the ACASA experiment are consistent with the
IMPRINT predictions. For example, in the ACASA experiment, the platoon leader experienced
communications performance decrements when he combined communications tasks with battle
tracking. Specifically, he missed a grid location given in a voice message because he was
already trying to locate a grid on the map. He gave incorrect grid coordinates to his platoon
and missed another voice message while trying to determine the correct grids. He also delayed
responding to a message from the company commander while he was tracking items on the map.

Although the test results for performance on communications tasks appear to be consistent with
IMPRINT predictions, the ARL researchers did not have enough ISA workload ratings to
correlate the workload ratings from the test with the workload predictions from their IMPRINT
FCS models. To compare the workload predictions from their FCS IMPRINT analyses with the
test data, the ARL researchers needed to know what task combinations the test participants were
performing when they gave a workload rating. These data were not available. The primary
objective of the test was to collect voice communications data, within a scenario that reflects
the FCS operational concept, for development of the ACASA tool. To meet this objective, the SA
Technology researchers needed SA and performance data whose collection required pausing the
simulation. The test designers decided that the test participants would give the ISA workload
ratings during the pauses as well. The rationale for this decision was to ensure that the ACASA
data collection, which was the primary test goal, was unaffected by the IMPRINT data collection
needs, which were a secondary test objective. The IMPRINT data collection would have required
the test participants to give more frequent workload ratings, because the IMPRINT analysts
needed to identify how workload ratings varied with specific task combinations. Because of this
decision, the IMPRINT analysts had difficulty determining which specific task combinations
correlated with the Soldiers’ workload ratings. To meet this challenge, the analysts reviewed
their observational data and identified which broad task categories, or functions, each
crewmember performed in the time interval prior to a scheduled simulation pause. They then
paired the ISA workload rating given during each pause with the functions performed in the
interval prior to that pause.
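
The pairing step described above can be illustrated with a short Python sketch. The record layout (`Observation`, times in seconds, and the `(pause_time, rating)` tuples) is hypothetical and chosen only for the example; the report does not describe how the observational data were stored.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    time_s: float   # seconds since mission start (hypothetical unit)
    function: str   # broad task category, e.g. "Voice Communications"


def pair_ratings_with_functions(observations, pauses):
    """Pair each pause rating with the functions seen since the previous pause.

    pauses: list of (pause_time_s, isa_rating) tuples.
    Returns a list of (isa_rating, [functions]) in chronological order.
    """
    paired = []
    interval_start = 0.0
    for pause_time, rating in sorted(pauses):
        functions = []
        for obs in observations:
            in_interval = interval_start <= obs.time_s < pause_time
            if in_interval and obs.function not in functions:
                functions.append(obs.function)
        paired.append((rating, functions))
        interval_start = pause_time
    return paired


# Made-up example: two pauses, four observed activities.
observations = [
    Observation(30.0, "Voice Communications"),
    Observation(95.0, "Battle Tracking"),
    Observation(140.0, "Voice Communications"),
    Observation(400.0, "Fire Mission"),
]
pauses = [
    (300.0, "All tasks going well"),
    (600.0, "More than enough time for all tasks"),
]
pairs = pair_ratings_with_functions(observations, pauses)
for rating, functions in pairs:
    print(rating, "<-", functions)
```

The sketch also makes the limitation visible: one rating is attached to every function in the interval, so it cannot show which task combination produced the rating.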

4.1.2  Platoon  Leader  Functions  and  Workload  Ratings 

The specific tasks the platoon leader performed across the eight missions were various
combinations of voice communications, battle tracking, fire mission tasks, and unmanned asset
operations. Across the eight missions in the test, the platoon leader reported via his ISA
ratings that his workload level “permitted more than enough time for all his tasks” and that
“all tasks were going well.” For the PSAQ, he reported he was working “somewhat hard to very
hard” while performing these tasks for most of the missions. He thought this effort resulted in
performance between “average” and “very well” and in situation awareness between “somewhat
aware” and “very aware” on most missions. The platoon leader’s workload ratings, unlike the
IMPRINT predictions, indicate his workload level was not high during the missions. However, his
observed performance was consistent with the IMPRINT predictions of a performance
decrement. For example, although he reported his performance for the first mission was
“average to very well,” he did experience performance errors while monitoring voice
communications concurrent with battle tracking. Specifically, he missed a grid location given
in a voice message because he was already trying to locate a grid on the map. He gave
incorrect grid coordinates to his platoon and missed another voice message while trying to
determine the correct grids. He also delayed responding to a message from the company
commander while he was tracking items on the map. These errors are consistent with IMPRINT
model predictions (Mitchell and Brennan, in review; Mitchell, 2005) of the impact of
communications tasks on battlefield awareness.

4.1.3 Platoon Leader Vehicle Crew Chief

In comparison to the platoon leader’s workload ratings, his crew chief’s overall workload
ratings were lower. The crew chief participated in six of the eight missions. For four of the
six missions he participated in, he reported that he had “more than enough time for all tasks”
or that he had “nothing to do” for all or parts of the missions. During these four missions, he
was doing fire mission and communications tasks. The fire mission tasks consisted mainly of
searching for potential targets. These tasks are the primary tasks typically performed by
gunners of combat platforms and are included in the IMPRINT models for the FCS mounted combat
system (MCS). In the ARL researchers’ analysis of the MCS models (Mitchell, 2005; Mitchell et
al., 2003), they predicted the gunner to have the lowest workload because his primary function
is to scan for targets. This prediction is consistent with the gunner’s ISA ratings of low
workload and nothing to do.

For his first mission, the crew chief reported he had “more than enough time for all tasks”
prior to the first pause and “nothing to do” after the second pause. “Nothing to do” was his
rating for missions 2, 7, and 8 as well. In contrast to the other four of his missions, the PLV
crew chief reported that for two missions all of his “tasks were going well.” For these two
missions, he did the same tasks as in the other four but, in addition, he controlled the
unmanned ground vehicle. Therefore, the unmanned asset control probably accounts for the higher
self-reported workload rating.

Just as his workload ratings were lower than the platoon leader’s, the crew chief rated his
level of effort during his missions as lower than the platoon leader’s. Across the six
missions, he reported he worked “not at all hard” to “somewhat hard.” Although his reported
level of effort was lower than the platoon leader’s, he perceived his performance to be between
“average” and “very well,” and he was reportedly “somewhat” to “very aware” of the situation.
Therefore, he rated his perceived performance and awareness as consistent with the platoon
leader’s self-reported ratings of these two categories. During these missions, the ARL
researchers recorded that he neutralized a target without permission from the company
commander. On the other hand, they recorded that he assisted the platoon leader by correcting
incorrect grids the platoon leader was reporting. The first observation indicates he was not
performing well or aware of the situation, because he should have waited for the company
commander’s permission. The second observation indicates that he was more aware of the correct
grids than his platoon leader and was performing very well with high awareness.

The SAGAT data that SA Tech is analyzing will provide further insight into the actual
performance level of the crew chief during his missions.




4.1.4  Platoon  Leader  Vehicle  Driver 

The ISA ratings the driver of the platoon leader’s vehicle gave for his missions fluctuated
between “all tasks going well” and “less important tasks suffering.” There is no obvious pattern
in the observational data in table 6 that explains the variations in his workload ratings.
However, that may be a reflection of the lack of actual driving required by the scenario. The
driver did not have to drive the vehicle often during the scenario. Instead, he participated in
voice communications and assisted with battle tracking and fire missions. He did not need to
move the vehicle because the platoon leader could move the unmanned ground vehicle to do
reconnaissance rather than moving his own vehicle. Mission 8 was the mission during which the
driver actually drove the vehicle most frequently. This mission had a consistent workload
rating of “all tasks are going well.” Whereas he reported his level of effort as “somewhat
hard” for most of the missions, he rated his level of effort for Mission 8 as “very hard.” He
rated his performance during this mission as between “average” and “very well” and reported he
was “very aware” of the situation. The ARL analysts recorded that at times he was spinning the
vehicle in circles to relieve boredom. There was no other observable pattern to his performance
and ratings.

4.1.5  Platoon  Sergeant 

In addition to the traditional platoon sergeant functions, the platoon sergeant controlled and
monitored the unmanned aerial vehicle (UAV) throughout the test. The traditional functions were
similar to those of the other positions in the test. Specifically, he performed battle
tracking, fire missions, and voice communications. His workload rating remained consistent
across all eight missions, with him reporting, “All tasks are going well.” He rated his level
of effort as “somewhat hard” to between “somewhat hard” and “very hard” for most of the
missions. Mission 7, however, had a lower level of effort rating of between “not at all hard”
and “somewhat hard.” Indeed, the ARL researcher noted that he fell asleep during this mission,
and this observation supports his reported lower level of effort. Despite the fact that he had
fallen asleep during this mission, he rated his performance as “average.” In addition, he rated
his performance as “average” to between “average” and “very well” across all eight missions. He
rated his awareness as between “somewhat aware” and “very aware” for all missions except
mission 7, during which he fell asleep. For this mission, he rated his awareness as between
“not at all aware” and “somewhat aware.” The ARL researchers observed him falling asleep, which
supports his rating of lower awareness of the situation.

In addition to their observations of the platoon sergeant falling asleep, the ARL researchers
noted that he had difficulty monitoring the UAV when he conducted battle tracking.
Specifically, either he located something on the map on his display or he monitored the flight
of the UAV. They had observed similar alternation of tasks by a platoon sergeant during the
Omni Fusion 06 test at Fort Knox. During that test, the platoon sergeant either completed
battle tracking tasks or monitored an unmanned armed reconnaissance vehicle but could not do
both concurrently (Mitchell, 2007).

In addition to his difficulty battle tracking while monitoring the UAV, the ARL researchers
observed other instances where the platoon sergeant seemed to have reduced situation awareness
while controlling the UAV. Specifically, he did not notice that one of the platoon members was
driving his vehicle over a simulated unmanned vehicle, he missed a threat that appeared in front
of the vehicle, and he missed an IED. In addition, they observed that he made several
communications errors, which included missing communications from the company commander and
platoon leader, using incorrect call signs when sending communications, missing part of an
order, missing grid coordinates, and missing a target engagement message. In mission 7, during
which he was falling asleep, the company commander had to notify him that he did not have the
UAV on the IED as ordered by the commander. During the next mission, he crashed the UAV
into a tree, used incorrect call signs, and was late to respond to messages from the
commander. His observed performance contradicts his self-reported workload ratings of all
tasks going well, as well as his performance and awareness ratings. Falling asleep,
communications problems, and unmanned asset control problems do not represent average to
very well performance or awareness between somewhat aware and very aware. Because he fell
asleep during the simulation, it is possible that the observed performance decrements were due
to task underload rather than overload. On the other hand, most of the errors occurred when he
was performing unmanned asset operations concurrent with other tasks, which indicates that
unmanned asset operations were contributing to overload.

4.1.6  Platoon  Sergeant  Vehicle  Crew  Chief 

Similar to the crew chief in the platoon leader’s vehicle, the crew chief in the platoon
sergeant’s vehicle acted as gunner and performed primarily voice communications and fire
mission related tasks. He reported “all tasks were going well” and that he had “more than
enough time” or “nothing to do” for most of the missions. He reported his level of effort
varied between “not at all hard” and between “not at all hard” and “somewhat hard” for the
majority of the missions. For two missions, he reported he worked “somewhat hard.” He rated his
performance as either “average” or “very well” except for two missions, for which he rated his
performance as between these two ratings. He rated his awareness of the situation as between
“somewhat aware” and “very aware” for all eight missions. The ARL researchers observed his
behavior for only two of the eight missions. During these two missions, they noted that he
seemed bored, as exhibited by his scanning the same place throughout the mission.

4.1.7  Platoon  Sergeant  Vehicle  Driver 

The driver of the platoon sergeant’s vehicle performed driving and voice communications as his
most frequent functions. However, the platoon sergeant’s vehicle did not move often during the
test. Some of the driving tasks the driver performed consisted of moving the vehicle back and
forth in the same place. Reflecting his low level of activity, the driver rated his workload as
“nothing to do” or “more than enough time for all tasks.” Similarly, he rated his level of
effort as “not at all hard” for all eight missions. Although he was not doing a lot, he rated
his performance as “very well” for the first four missions and “average” for the last four
missions. Furthermore, for all but the last mission, he rated himself as “very aware of the
situation.” For the last mission, he rated himself as between “somewhat aware” and “very
aware.” The ARL researchers did not record any performance errors for him.

4.1.8 Communications Data Analysis

Because the collection of voice communications for the ACASA tool was the primary test
objective, detailed voice data were available for analysis. These data included verbatim
transcripts of all the voice communications from the test. From this transcription, the ARL
researchers calculated the number of messages each platoon member sent to another platoon
member and the length of each of these messages, as shown in table 10. They can use this
message data in IMPRINT models to provide estimates of voice traffic within a platoon that does
not have digital capability.
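
The message tally described above can be sketched as a simple grouping over transcript records. The `(sender, receiver, duration_ms)` record format and the sample values are assumptions made for illustration; they are not taken from the test data.

```python
from collections import defaultdict


def summarize_messages(messages):
    """Count messages and average their duration for each sender/receiver pair.

    messages: iterable of (sender, receiver, duration_ms) tuples.
    Returns {(sender, receiver): {"count": int, "mean_ms": float}}.
    """
    totals = defaultdict(lambda: [0, 0.0])  # pair -> [count, total duration]
    for sender, receiver, duration_ms in messages:
        entry = totals[(sender, receiver)]
        entry[0] += 1
        entry[1] += duration_ms
    return {
        pair: {"count": count, "mean_ms": total / count}
        for pair, (count, total) in totals.items()
    }


# Made-up transcript records using position labels from table 10.
messages = [
    ("CO", "PL", 6000),
    ("CO", "PL", 5000),
    ("PL", "CO", 9000),
]
summary = summarize_messages(messages)
for pair, stats in sorted(summary.items()):
    print(pair, stats)
```

A summary of this kind could then feed the frequency and duration inputs of a communications task in a model, which is the use the paragraph above describes.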

In addition to calculating the frequency and duration of messages within the platoon, the ARL
researchers correlated the IMPRINT times predicted for a platoon member to speak a message
with the actual test times for the spoken messages, as shown in table 11. By squaring the
r-value (0.649) from the correlation, they calculated the coefficient of determination. This
value expresses how much variance the two data sets share, that is, the percentage of variance
in one variable that is explained by the other. The value from this test, 0.649 squared,
indicates about 42% shared variance between actual and predicted voice communication times.
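
The coefficient of determination is simple enough to check directly. The short Python sketch below uses only the r-value reported in table 11; the function name is illustrative. Squaring 0.649 gives about 0.421, while a figure of 42.25% corresponds to rounding r to 0.65 before squaring.

```python
def shared_variance_percent(r):
    """Coefficient of determination (r squared), expressed as a percentage."""
    return 100.0 * r * r


r = 0.649  # Pearson correlation reported in table 11

exact = shared_variance_percent(r)
from_rounded_r = shared_variance_percent(round(r, 2))  # r rounded to 0.65 first

print(f"r^2 from r = 0.649: {exact:.2f}%")
print(f"r^2 from r = 0.65:  {from_rounded_r:.2f}%")
```

Either way, somewhat more than half of the variance in the measured speaking times is not accounted for by the predicted times.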

This percentage indicates a correlation between the analytical predictions of the IMPRINT
speech micro-model and the test data. The researchers also noticed the high usage of short
sentences of one to seven words throughout the experiment. This suggests that Soldiers prefer
concise verbal communications to longer ones during combat operations. Soldier subject matter
experts have reported to the ARL researchers that they prefer short communications, and these
data support their report. Based on this test data, IMPRINT analysts can select workload
ratings for one or two words as inputs for communications tasks.


5.  Conclusions  and  Recommendations 


The researchers who participated in this test had several concurrent objectives to achieve.
Unfortunately, the criteria necessary to satisfy the primary objective, collecting voice data,
interfered with data collection for the secondary objective, verifying workload predictions.
Specifically, throughout the test, the test participants were giving self-report workload,
level of effort, performance, and awareness ratings. By definition, self-report ratings are an
individual’s own estimate and are subjective. To verify these ratings, researchers need to know
what the individuals were doing when they gave the ratings and how the individuals actually
performed. During this test, however, the test participants gave their self-report ratings
during pauses, when it was unclear what they had been doing prior to the pauses. The ARL
researchers’ written observations describe what the test participants were doing in the segment
prior to the rating. However, because what they were doing changed throughout the segment but
each test participant gave only one self-report rating, it is unclear what activities
correlated with the ratings. Therefore, the researchers recommend that subsequent experiments
have workload data collection as a primary objective.

With workload data collection as a primary objective, the researchers could make sure that the
workload ratings correlate with Soldier activities. They could do this by developing a scenario
that controls the activities the Soldiers perform at specific times. For example, for a
three-Soldier MCS crew, a segment of the scenario could consist of driving from one checkpoint
to another checkpoint. During this segment, the driver drives the vehicle, the crew chief scans
for threats, and the commander battle tracks. When the scenario is paused and an ISA rating is
given, the researchers would know the major functions each Soldier was performing prior to the
pause. To vary workload throughout the scenario, the researchers could add functions to this
basic set of drive, scan, and battle track in other segments and obtain workload ratings. For
example, another segment of the scenario could have the vehicle moving from one checkpoint to
another checkpoint while the crew monitors an unmanned robotic vehicle. The driver drives, the
crew chief scans for threats and monitors the unmanned vehicle, and the commander battle
tracks. Following this procedure, the researchers could compare the crew chief’s workload
ratings from each segment and attribute any differences to the addition of unmanned vehicle
monitoring.

Creating scenario segments of Soldier functions will permit better assessment of workload, but
to connect these workload ratings to performance, the researchers will need to add performance
metrics to each segment as well. For example, they can assess the driver’s deviation from the
route in each segment, the number of targets identified by the crew chief, or the commander’s
correct identification of friendly locations and unmanned asset interventions. These
performance metrics permit the researchers to correlate any changes in workload levels across
segments with changes in performance.
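
The recommended segment design can be illustrated with a small Python sketch that compares one crewmember across two segments differing by a single added function. The `SegmentResult` structure, the numeric coding of the ISA rating, and the performance values are all hypothetical and serve only to show the comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentResult:
    name: str
    functions: frozenset  # functions one crewmember performed in the segment
    workload: int         # ISA rating coded 1 (low) to 5 (high); coding is assumed
    performance: float    # hypothetical metric, e.g. number of targets identified


def added_function_effect(baseline, variant):
    """Report rating changes between two segments that differ by added functions."""
    removed = baseline.functions - variant.functions
    if removed:
        raise ValueError(f"variant drops baseline functions: {sorted(removed)}")
    return {
        "added": sorted(variant.functions - baseline.functions),
        "workload_change": variant.workload - baseline.workload,
        "performance_change": variant.performance - baseline.performance,
    }


# Made-up crew chief results for the two example segments described above.
scan_only = SegmentResult("checkpoint 1 to 2", frozenset({"scan for threats"}), 2, 8.0)
scan_and_monitor = SegmentResult(
    "checkpoint 2 to 3",
    frozenset({"scan for threats", "monitor unmanned vehicle"}),
    4,
    5.0,
)
effect = added_function_effect(scan_only, scan_and_monitor)
print(effect)
```

Because the two segments differ only in the added monitoring task, any change in rating or performance can be attributed to that task, which is the inference the recommendation is designed to support.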